{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kSNEB-3orJ9C"
   },
   "source": [
    "# **Sentiment Analysis: Sparse Transfer Learning with the Python API**\n",
    "\n",
    "In this example, you will fine-tune a 90% pruned BERT model onto the SST2 dataset using SparseML's Hugging Face Integration.\n",
    "\n",
    "### **Sparse Transfer Learning Overview**\n",
    "\n",
    "Sparse Transfer Learning is very similiar to the typical transfer learning process used to train NLP models, where we fine-tune a pretrained checkpoint onto a smaller downstream dataset. With Sparse Transfer Learning, however, we simply start the training process from a pre-sparsified checkpoint and maintain sparsity while the fine-tuning occurs.\n",
    "\n",
    "At the end, you will have a sparse model trained on your dataset, ready to be deployed with DeepSparse for GPU-class performance on CPUs!\n",
    "\n",
    "### **Pre-Sparsified BERT**\n",
    "SparseZoo, Neural Magic's open source repository of pre-sparsified models, contains a 90% pruned version of BERT, which has been sparsified on the upstream Wikipedia and BookCorpus datasets with the\n",
    "masked language modeling objective. [Check out the model card](https://sparsezoo.neuralmagic.com/models/nlp%2Fmasked_language_modeling%2Fobert-base%2Fpytorch%2Fhuggingface%2Fwikipedia_bookcorpus%2Fpruned90-none). We will use this model as the starting point for the transfer learning process.\n",
    "\n",
    "\n",
    "**Let's dive in!**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Y0WybTbssU0g"
   },
   "source": [
    "## **Installation**\n",
    "\n",
    "Install SparseML via `pip`.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "id": "AkR1u2_NnXqY"
   },
   "outputs": [],
   "source": [
    "!pip install sparseml[transformers]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_jY0SKdXFGO3"
   },
   "source": [
    "If you are running on Google Colab, restart the runtime after this step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "XXj0S5Jdq2M-"
   },
   "outputs": [],
   "source": [
    "import sparseml\n",
    "from sparsezoo import Model\n",
    "from sparseml.transformers.utils import SparseAutoModel\n",
    "from sparseml.transformers.sparsification import Trainer, TrainingArguments\n",
    "import numpy as np\n",
    "from transformers import (\n",
    "    AutoModelForSequenceClassification,\n",
    "    AutoConfig, \n",
    "    AutoTokenizer, \n",
    "    EvalPrediction, \n",
    "    default_data_collator\n",
    ")\n",
    "from datasets import load_dataset, load_metric"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2rS2Q5kxFcW3"
   },
   "source": [
    "## **Step 1: Load a Dataset**\n",
    "\n",
    "SparseML is integrated with Hugging Face, so we can use the `datasets` class to load datasets from the Hugging Face hub or from local files.\n",
    "\n",
    "[SST2 Dataset Card](https://huggingface.co/datasets/glue/viewer/sst2/test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "nT8RoT-yGFxy"
   },
   "outputs": [],
   "source": [
    "# load dataset natively\n",
    "dataset = load_dataset(\"glue\", \"sst2\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "_zd5iVVHRAFL"
   },
   "outputs": [],
   "source": [
    "print(dataset)\n",
    "print(dataset[\"train\"][0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "CRBqBYXvGdpl"
   },
   "outputs": [],
   "source": [
    "# alternatively, save as csv and reload\n",
    "dataset[\"train\"].to_csv(\"sst2-train.csv\")\n",
    "dataset[\"validation\"].to_csv(\"sst2-validation.csv\")\n",
    "\n",
    "data_files = {\n",
    "  \"train\": \"sst2-train.csv\",\n",
    "  \"validation\": \"sst2-validation.csv\"\n",
    "}\n",
    "dataset = load_dataset(\"csv\", data_files=data_files)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "aB8nezNJQ9Rz"
   },
   "outputs": [],
   "source": [
    "print(dataset)\n",
    "print(dataset[\"train\"][0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "_5kKXKmHGrQm"
   },
   "outputs": [],
   "source": [
    "!head sst2-train.csv --lines=5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "zM4GG9iMGvDA"
   },
   "outputs": [],
   "source": [
    "!head sst2-validation.csv --lines=5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3BfXUE9HHFoq"
   },
   "source": [
    "## **Step 2: Setup Evaluation Metric**\n",
    "\n",
    "Sentiment analysis is a single sequence binary classification problem. We will use `accuracy` (% of correct predictions) as the evaluation metric. \n",
    "\n",
    "Since SparseML is integrated with Hugging Face, we can pass a `compute_metrics` function for evaluation (which will be passed to the `Trainer` class below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "PL8HbrQzHRCF"
   },
   "outputs": [],
   "source": [
    "metric = load_metric(\"glue\", \"sst2\")\n",
    "\n",
    "def compute_metrics(p: EvalPrediction):\n",
    "  preds = p.predictions[0] if isinstance(p.predictions, tuple) else p.predictions\n",
    "  preds = np.argmax(preds, axis=1)\n",
    "  result = metric.compute(predictions=preds, references=p.label_ids)\n",
    "  if len(result) > 1:\n",
    "      result[\"combined_score\"] = np.mean(list(result.values())).item()\n",
    "  return result"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "1GEhYi53HoAH"
   },
   "source": [
    "## **Step 3: Download Files for Sparse Transfer Learning**\n",
    "\n",
    "First, we need to select a sparse checkpoint to begin the training process. In this case, we will fine-tune a 90% pruned version of BERT onto the SST2 dataset. This model is available in SparseZoo, identified by the following stub:\n",
    "\n",
    "```\n",
    "zoo:nlp/masked_language_modeling/obert-base/pytorch/huggingface/wikipedia_bookcorpus/pruned90-none\n",
    "```\n",
    "\n",
    "Next, we need to create a sparsification recipe for usage in the training process. Recipes are YAML files that encode the sparsity related algorithms and parameters to be applied by SparseML. For Sparse Transfer Learning, we need to use a recipe that instructs SparseML to maintain sparsity during the training process and to apply quantization over the final few epochs. \n",
    "\n",
    "In the case of SST2, there is a transfer learning recipe available in the SparseZoo, identified by the following stub:\n",
    "```\n",
    "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none\n",
    "```\n",
    "\n",
    "Finally, SparseML has the optional ability to apply model distillation from a teacher model during the transfer learning process to boost accuracy. In this case, we will use a dense version of BERT trained on the SST2 dataset which is hosted in SparseZoo. This model is identified by the following stub:\n",
    "\n",
    "```\n",
    "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/base-none\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JFoRllfCJnxE"
   },
   "source": [
    "Use the `sparsezoo` python client to download the models and recipe using their SparseZoo stubs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "zuUY-eu_IhW0"
   },
   "outputs": [],
   "source": [
    "# downloads pruned-BERT model\n",
    "model_stub = \"zoo:nlp/masked_language_modeling/obert-base/pytorch/huggingface/wikipedia_bookcorpus/pruned90-none\" \n",
    "download_dir = \"./model\"\n",
    "zoo_model = Model(model_stub, download_path=download_dir)\n",
    "model_path = zoo_model.training.path \n",
    "\n",
    "print(model_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "MNdDad8qKYo8"
   },
   "outputs": [],
   "source": [
    "%ls ./model/training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "HvJ9_wgyIkej"
   },
   "outputs": [],
   "source": [
    "# downloads transfer learning recipe\n",
    "transfer_stub = \"zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none\"\n",
    "download_dir = \"./transfer_recipe\"\n",
    "zoo_model = Model(transfer_stub, download_path=download_dir)\n",
    "recipe_path = zoo_model.recipes.default.path\n",
    "\n",
    "print(recipe_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "zbj7BXymIqtT"
   },
   "outputs": [],
   "source": [
    "%ls ./transfer_recipe/recipe/recipe_original.md"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "K0m0amQuKV2H"
   },
   "outputs": [],
   "source": [
    "# downloads teacher\n",
    "teacher_stub = \"zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/base-none\"\n",
    "download_dir = \"./teacher\"\n",
    "zoo_model = Model(teacher_stub, download_path=download_dir)\n",
    "teacher_path = zoo_model.training.path "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "0-3z9FNJKeSF"
   },
   "outputs": [],
   "source": [
    "%ls ./teacher/training"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Inspecting the Recipe**\n",
    "\n",
    "Here is the transfer learning recipe:\n",
    "\n",
    "```yaml\n",
    "version: 1.1.0\n",
    "\n",
    "# General Variables\n",
    "num_epochs: &num_epochs 13\n",
    "init_lr: 1.5e-4\n",
    "final_lr: 0\n",
    "\n",
    "qat_start_epoch: &qat_start_epoch 8.0\n",
    "observer_epoch: &observer_epoch 12.0\n",
    "quantize_embeddings: &quantize_embeddings 1\n",
    "\n",
    "distill_hardness: &distill_hardness 1.0\n",
    "distill_temperature: &distill_temperature 2.0\n",
    "\n",
    "weight_decay: 0.01\n",
    "\n",
    "# Modifiers:\n",
    "\n",
    "training_modifiers:\n",
    "  - !EpochRangeModifier\n",
    "      end_epoch: eval(num_epochs)\n",
    "      start_epoch: 0.0\n",
    "  - !LearningRateFunctionModifier\n",
    "      start_epoch: 0\n",
    "      end_epoch: eval(num_epochs)\n",
    "      lr_func: linear\n",
    "      init_lr: eval(init_lr)\n",
    "      final_lr: eval(final_lr)\n",
    "\n",
    "quantization_modifiers:\n",
    "  - !QuantizationModifier\n",
    "      start_epoch: eval(qat_start_epoch)\n",
    "      disable_quantization_observer_epoch: eval(observer_epoch)\n",
    "      freeze_bn_stats_epoch: eval(observer_epoch)\n",
    "      quantize_embeddings: eval(quantize_embeddings)\n",
    "      quantize_linear_activations: 0\n",
    "      exclude_module_types: ['LayerNorm', 'Tanh']\n",
    "      submodules:\n",
    "        - bert.embeddings\n",
    "        - bert.encoder\n",
    "        - bert.pooler\n",
    "        - classifier\n",
    "\n",
    "\n",
    "distillation_modifiers:\n",
    "  - !DistillationModifier\n",
    "     hardness: eval(distill_hardness)\n",
    "     temperature: eval(distill_temperature)\n",
    "     distill_output_keys: [logits]\n",
    "\n",
    "constant_modifiers:\n",
    "  - !ConstantPruningModifier\n",
    "      start_epoch: 0.0\n",
    "      params: __ALL_PRUNABLE__\n",
    "\n",
    "regularization_modifiers:\n",
    "  - !SetWeightDecayModifier\n",
    "      start_epoch: 0.0\n",
    "      weight_decay: eval(weight_decay)\n",
    "```\n",
    "\n",
    "\n",
    "The `Modifiers` in the transfer learning recipe are the important items that encode how SparseML should modify the training process for Sparse Transfer Learning:\n",
    "- `ConstantPruningModifier` tells SparseML to pin weights at 0 over all epochs, maintaining the sparsity structure of the network\n",
    "- `QuantizationModifier` tells SparseML to quanitze the weights with quantization aware training over the last 5 epochs\n",
    "- `DistillationModifier` tells SparseML how to apply distillation during the trainign process, targeting the logits\n",
    "\n",
    "Below, SparseML's `Trainer` will parses the modifiers and updates the training process to implement the algorithms specified here."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "FStnDScEKoMX"
   },
   "source": [
    "## **Step 4: Setup Hugging Face Model Objects**\n",
    "\n",
    "Next, we will set up the Hugging Face `tokenizer`, `config`, and `model`. \n",
    "\n",
    "These are all native Hugging Face objects, so check out the Hugging Face docs for more details on `AutoModel`, `AutoConfig`, and `AutoTokenizer` as needed. \n",
    "\n",
    "We instantiate these classes by passing the local path to the directory containing the `pytorch_model.bin`, `tokenizer.json`, and `config.json` files from the SparseZoo download."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "dJoAMnu-Lkod"
   },
   "outputs": [],
   "source": [
    "# shared tokenizer between teacher and student\n",
    "# see examples for how to use models with different tokenizers\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_path)\n",
    "\n",
    "# setup configs\n",
    "model_config = AutoConfig.from_pretrained(model_path, num_labels=2)\n",
    "teacher_config = AutoConfig.from_pretrained(teacher_path, num_labels=2)\n",
    "\n",
    "model_kwargs = {\"config\": model_config}\n",
    "model_kwargs[\"state_dict\"], s_delayed = SparseAutoModel._loadable_state_dict(model_path)\n",
    "model = AutoModelForSequenceClassification.from_pretrained(model_path, **model_kwargs,)\n",
    "\n",
    "teacher_kwargs = {'config':teacher_config}\n",
    "teacher_kwargs[\"state_dict\"], t_delayed = SparseAutoModel._loadable_state_dict(teacher_path)\n",
    "teacher = AutoModelForSequenceClassification.from_pretrained(teacher_path, **teacher_kwargs,)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "K1JSDkCdMghS"
   },
   "source": [
    "## **Step 5: Tokenize Dataset**\n",
    "\n",
    "Run the tokenizer on the dataset. This is standard Hugging Face functionality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "2WoYVUnAQKvm"
   },
   "outputs": [],
   "source": [
    "print(dataset[\"train\"].features)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "87A6FfSsMveZ"
   },
   "outputs": [],
   "source": [
    "print(dataset)\n",
    "\n",
    "# setup dataset configuration\n",
    "INPUT_COL_1 = \"sentence\"\n",
    "INPUT_COL_2 = None\n",
    "LABEL_COL = \"label\"\n",
    "NUM_LABELS = len(dataset[\"train\"].unique(LABEL_COL))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "x3v3WFrHFLoO"
   },
   "outputs": [],
   "source": [
    "MAX_LEN = 128\n",
    "def preprocess_fn(examples):\n",
    "  args = None\n",
    "  if INPUT_COL_2 is None:\n",
    "    args = (examples[INPUT_COL_1], )\n",
    "  else:\n",
    "    args = (examples[INPUT_COL_1], examples[INPUT_COL_2])\n",
    "  result = tokenizer(*args, \n",
    "                   padding=\"max_length\", \n",
    "                   max_length=min(tokenizer.model_max_length, MAX_LEN), \n",
    "                   truncation=True)\n",
    "  return result\n",
    "\n",
    "tokenized_dataset = dataset.map(\n",
    "    preprocess_fn,\n",
    "    batched=True,\n",
    "    desc=\"Running tokenizer on dataset\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "19mnPsKHN_y1"
   },
   "source": [
    "## **Step 6: Run Training**\n",
    "\n",
    "SparseML has a custom `Trainer` class that inherits from the [Hugging Face `Trainer` Class](https://huggingface.co/docs/transformers/main_classes/trainer). As such, the SparseML `Trainer` has all of the existing functionality of the HF trainer. However, in addition, we can supply a `recipe` and (optionally) a `teacher`. \n",
    "\n",
    "\n",
    "As we saw above, the `recipe` encodes the sparsity related algorithms and hyperparameters of the training process in a YAML file. The SparseML `Trainer` parses the `recipe` and adjusts the training workflow to apply the algorithms in the recipe.\n",
    "\n",
    "The `teacher` is an optional argument that instructs SparseML to apply model distillation to support the training process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "bCi5ZL57Lixc"
   },
   "outputs": [],
   "source": [
    "training_args = TrainingArguments(\n",
    "    output_dir=\"./training_output\",\n",
    "    do_train=True,\n",
    "    do_eval=True,\n",
    "    resume_from_checkpoint=False,\n",
    "    evaluation_strategy=\"epoch\",\n",
    "    save_strategy=\"epoch\",\n",
    "    logging_strategy=\"epoch\",\n",
    "    save_total_limit=1,\n",
    "    per_device_train_batch_size=32,\n",
    "    per_device_eval_batch_size=32,\n",
    "    fp16=True)\n",
    "\n",
    "trainer = Trainer(\n",
    "    model=model,\n",
    "    model_state_path=model_path,\n",
    "    recipe=recipe_path,\n",
    "    teacher=teacher,\n",
    "    metadata_args=[\"per_device_train_batch_size\",\"per_device_eval_batch_size\",\"fp16\"],\n",
    "    args=training_args,\n",
    "    train_dataset=tokenized_dataset[\"train\"],\n",
    "    eval_dataset=tokenized_dataset[\"validation\"],\n",
    "    tokenizer=tokenizer,\n",
    "    data_collator=default_data_collator,\n",
    "    compute_metrics=compute_metrics)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "id": "bNrip0sYifOE"
   },
   "outputs": [],
   "source": [
    "train_result = trainer.train(resume_from_checkpoint=False)\n",
    "trainer.save_model()  # Saves the tokenizer too for easy upload\n",
    "trainer.save_state()\n",
    "trainer.save_optimizer_and_scheduler(training_args.output_dir)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Step 7: Export To ONNX**\n",
    "\n",
    "Run the following to export the model to ONNX. The script creates a `deployment` folder containing ONNX file and the necessary configuration files (e.g. `tokenizer.json`) for deployment with DeepSparse."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-rhWjiHBeR7M"
   },
   "outputs": [],
   "source": [
    "!sparseml.transformers.export_onnx \\\n",
    "  --model_path training_output \\\n",
    "  --task text_classification"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Next Steps**\n",
    "\n",
    "Checkout the DeepSparse repository for more details on deploying your sparse models with GPU class performance on CPUs!"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "authorship_tag": "ABX9TyP+TewvzeFsk4xSqAYcYfFE",
   "provenance": [
    {
     "file_id": "1Zawa0sifXr2wIl9tbF7ySJ7xYY0dtTzI",
     "timestamp": 1677193660159
    }
   ]
  },
  "gpuClass": "standard",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "vscode": {
   "interpreter": {
    "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
